Overview

Dataset statistics

Number of variables: 15
Number of observations: 2000
Missing cells: 1892
Missing cells (%): 6.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 234.5 KiB
Average record size in memory: 120.1 B
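
The derived figures above can be cross-checked from the raw counts. The sketch below is only an arithmetic check, not the profiler's own code: the per-column missing counts are copied from the Warnings section of this report, and the variable names are illustrative.

```python
# Figures copied from this report; names are illustrative, not the profiler's.
n_variables = 15
n_observations = 2000
missing_per_column = {
    "Genetic_Pedigree_Coefficient": 92,
    "Pregnancy": 1558,
    "alcohol_consumption_per_day": 242,
}

total_cells = n_variables * n_observations        # every cell in the table
missing_cells = sum(missing_per_column.values())  # cells with no value
missing_pct = 100 * missing_cells / total_cells

total_bytes = 234.5 * 1024                        # KiB -> bytes
avg_record_bytes = total_bytes / n_observations

print(f"Missing cells: {missing_cells} ({missing_pct:.1f}%)")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Both derived values agree with the reported 6.3% and 120.1 B after rounding to one decimal place.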

Variable types

Numeric: 8
Categorical: 7

Warnings

Blood_Pressure_Abnormality is highly correlated with Chronic_kidney_disease and 1 other field (High correlation)
Chronic_kidney_disease is highly correlated with Blood_Pressure_Abnormality (High correlation)
Adrenal_and_thyroid_disorders is highly correlated with Blood_Pressure_Abnormality (High correlation)
Blood_Pressure_Abnormality is highly correlated with Adrenal_and_thyroid_disorders and 3 other fields (High correlation)
Adrenal_and_thyroid_disorders is highly correlated with Blood_Pressure_Abnormality and 1 other field (High correlation)
Genetic_Pedigree_Coefficient is highly correlated with Blood_Pressure_Abnormality (High correlation)
Chronic_kidney_disease is highly correlated with Blood_Pressure_Abnormality and 1 other field (High correlation)
Sex is highly correlated with Level_of_Hemoglobin (High correlation)
Level_of_Hemoglobin is highly correlated with Blood_Pressure_Abnormality and 1 other field (High correlation)
Pregnancy is highly correlated with Sex (High correlation)
Blood_Pressure_Abnormality is highly correlated with Adrenal_and_thyroid_disorders and 1 other field (High correlation)
Sex is highly correlated with Pregnancy (High correlation)
Genetic_Pedigree_Coefficient has 92 (4.6%) missing values (Missing)
Pregnancy has 1558 (77.9%) missing values (Missing)
alcohol_consumption_per_day has 242 (12.1%) missing values (Missing)
Patient_Number is uniformly distributed (Uniform)
Patient_Number has unique values (Unique)

Reproduction

Analysis started: 2021-08-08 17:53:19.940701
Analysis finished: 2021-08-08 17:53:27.731653
Duration: 7.79 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Patient_Number
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 2000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1000.5
Minimum: 1
Maximum: 2000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 1
5-th percentile: 100.95
Q1: 500.75
median: 1000.5
Q3: 1500.25
95-th percentile: 1900.05
Maximum: 2000
Range: 1999
Interquartile range (IQR): 999.5
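
The fractional quantiles for Patient_Number (for example 100.95 for the 5th percentile) are what linear interpolation between neighbouring order statistics gives for the values 1 to 2000. The sketch below assumes that interpolation rule, which is the pandas default; the report itself does not state which rule it applies, and the helper name `quantile` is illustrative.

```python
def quantile(sorted_values, q):
    """Quantile by linear interpolation between the two closest ranks."""
    pos = q * (len(sorted_values) - 1)       # fractional 0-based position
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])

# Patient_Number takes every integer from 1 to 2000 exactly once.
patient_numbers = list(range(1, 2001))

p05 = quantile(patient_numbers, 0.05)
q1 = quantile(patient_numbers, 0.25)
median = quantile(patient_numbers, 0.50)
q3 = quantile(patient_numbers, 0.75)
p95 = quantile(patient_numbers, 0.95)
iqr = q3 - q1

print(p05, q1, median, q3, p95, iqr)
```

Up to floating-point rounding, the results match the quantile statistics listed above.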

Descriptive statistics

Standard deviation: 577.4945887
Coefficient of variation (CV): 0.5772059857
Kurtosis: -1.2
Mean: 1000.5
Median Absolute Deviation (MAD): 500
Skewness: 0
Sum: 2001000
Variance: 333500
Monotonicity: Strictly increasing
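
Because Patient_Number is simply the integers 1 to 2000, several of these descriptive statistics can be reproduced with the standard library. This is a sketch for checking the numbers, assuming the sample (n - 1) variance; it is not the profiler's implementation, and skewness and kurtosis are left out.

```python
import math
import statistics

values = list(range(1, 2001))            # Patient_Number: 1..2000

total = sum(values)
mean = statistics.mean(values)
variance = statistics.variance(values)   # sample variance (divides by n - 1)
std = math.sqrt(variance)
cv = std / mean                          # coefficient of variation

# Median absolute deviation: median distance from the median.
median = statistics.median(values)
mad = statistics.median(abs(v - median) for v in values)

print(total, mean, variance, std, cv, mad)
```

The coefficient of variation is the standard deviation divided by the mean, which is why it is dimensionless.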
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
2 | 1 | 0.1%
659 | 1 | 0.1%
685 | 1 | 0.1%
683 | 1 | 0.1%
681 | 1 | 0.1%
679 | 1 | 0.1%
677 | 1 | 0.1%
675 | 1 | 0.1%
673 | 1 | 0.1%
671 | 1 | 0.1%
Other values (1990) | 1990 | 99.5%

Value | Count | Frequency (%)
1 | 1 | 0.1%
2 | 1 | 0.1%
3 | 1 | 0.1%
4 | 1 | 0.1%
5 | 1 | 0.1%
6 | 1 | 0.1%
7 | 1 | 0.1%
8 | 1 | 0.1%
9 | 1 | 0.1%
10 | 1 | 0.1%

Value | Count | Frequency (%)
2000 | 1 | 0.1%
1999 | 1 | 0.1%
1998 | 1 | 0.1%
1997 | 1 | 0.1%
1996 | 1 | 0.1%
1995 | 1 | 0.1%
1994 | 1 | 0.1%
1993 | 1 | 0.1%
1992 | 1 | 0.1%
1991 | 1 | 0.1%

Blood_Pressure_Abnormality
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
0: 1013
1: 987

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Most occurring characters

Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1013 | 50.6%
1 | 987 | 49.4%

Level_of_Hemoglobin
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 757
Distinct (%): 37.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.710035
Minimum: 8.1
Maximum: 17.56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 8.1
5-th percentile: 8.58
Q1: 10.1475
median: 11.33
Q3: 12.945
95-th percentile: 16.01
Maximum: 17.56
Range: 9.46
Interquartile range (IQR): 2.7975

Descriptive statistics

Standard deviation: 2.186700638
Coefficient of variation (CV): 0.1867373272
Kurtosis: -0.1842879759
Mean: 11.710035
Median Absolute Deviation (MAD): 1.36
Skewness: 0.6570660942
Sum: 23420.07
Variance: 4.781659679
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
12.07 | 11 | 0.5%
11.58 | 10 | 0.5%
11.54 | 9 | 0.4%
11.95 | 8 | 0.4%
10.89 | 8 | 0.4%
11.19 | 8 | 0.4%
10.98 | 8 | 0.4%
10.38 | 8 | 0.4%
10.55 | 8 | 0.4%
11.16 | 8 | 0.4%
Other values (747) | 1914 | 95.7%

Value | Count | Frequency (%)
8.1 | 2 | 0.1%
8.12 | 1 | 0.1%
8.13 | 4 | 0.2%
8.15 | 2 | 0.1%
8.16 | 2 | 0.1%
8.17 | 4 | 0.2%
8.18 | 3 | 0.1%
8.19 | 1 | 0.1%
8.2 | 2 | 0.1%
8.21 | 1 | 0.1%

Value | Count | Frequency (%)
17.56 | 1 | 0.1%
17.54 | 1 | 0.1%
17.53 | 1 | 0.1%
17.52 | 2 | 0.1%
17.51 | 1 | 0.1%
17.48 | 1 | 0.1%
17.45 | 1 | 0.1%
17.44 | 3 | 0.1%
17.39 | 1 | 0.1%
17.35 | 1 | 0.1%

Genetic_Pedigree_Coefficient
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 101
Distinct (%): 5.3%
Missing: 92
Missing (%): 4.6%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4948165618
Minimum: 0
Maximum: 1
Zeros: 17
Zeros (%): 0.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0.04
Q1: 0.24
median: 0.49
Q3: 0.74
95-th percentile: 0.9565
Maximum: 1
Range: 1
Interquartile range (IQR): 0.5

Descriptive statistics

Standard deviation: 0.2917358818
Coefficient of variation (CV): 0.5895839071
Kurtosis: -1.17856276
Mean: 0.4948165618
Median Absolute Deviation (MAD): 0.25
Skewness: 0.01517745777
Sum: 944.11
Variance: 0.08510982475
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0.86 | 32 | 1.6%
0.13 | 30 | 1.5%
0.63 | 28 | 1.4%
0.56 | 27 | 1.4%
0.17 | 27 | 1.4%
0.99 | 26 | 1.3%
0.25 | 25 | 1.2%
0.06 | 25 | 1.2%
0.46 | 25 | 1.2%
0.95 | 25 | 1.2%
Other values (91) | 1638 | 81.9%
(Missing) | 92 | 4.6%

Value | Count | Frequency (%)
0 | 17 | 0.9%
0.01 | 23 | 1.1%
0.02 | 24 | 1.2%
0.03 | 17 | 0.9%
0.04 | 23 | 1.1%
0.05 | 15 | 0.8%
0.06 | 25 | 1.2%
0.07 | 11 | 0.5%
0.08 | 21 | 1.1%
0.09 | 21 | 1.1%

Value | Count | Frequency (%)
1 | 18 | 0.9%
0.99 | 26 | 1.3%
0.98 | 19 | 0.9%
0.97 | 18 | 0.9%
0.96 | 15 | 0.8%
0.95 | 25 | 1.2%
0.94 | 18 | 0.9%
0.93 | 13 | 0.7%
0.92 | 21 | 1.1%
0.91 | 11 | 0.5%

Age
Real number (ℝ≥0)

Distinct: 58
Distinct (%): 2.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 46.5585
Minimum: 18
Maximum: 75
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 18
5-th percentile: 20
Q1: 32
median: 46
Q3: 62
95-th percentile: 73
Maximum: 75
Range: 57
Interquartile range (IQR): 30

Descriptive statistics

Standard deviation: 17.10783203
Coefficient of variation (CV): 0.3674480928
Kurtosis: -1.248231524
Mean: 46.5585
Median Absolute Deviation (MAD): 15
Skewness: 0.02117832032
Sum: 93117
Variance: 292.6779167
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
18 | 46 | 2.3%
72 | 45 | 2.2%
21 | 43 | 2.1%
71 | 43 | 2.1%
25 | 41 | 2.1%
69 | 41 | 2.1%
53 | 41 | 2.1%
39 | 40 | 2.0%
29 | 40 | 2.0%
49 | 40 | 2.0%
Other values (48) | 1580 | 79.0%

Value | Count | Frequency (%)
18 | 46 | 2.3%
19 | 27 | 1.4%
20 | 29 | 1.5%
21 | 43 | 2.1%
22 | 34 | 1.7%
23 | 27 | 1.4%
24 | 35 | 1.8%
25 | 41 | 2.1%
26 | 34 | 1.7%
27 | 37 | 1.8%

Value | Count | Frequency (%)
75 | 36 | 1.8%
74 | 39 | 1.9%
73 | 37 | 1.8%
72 | 45 | 2.2%
71 | 43 | 2.1%
70 | 32 | 1.6%
69 | 41 | 2.1%
68 | 38 | 1.9%
67 | 32 | 1.6%
66 | 34 | 1.7%

BMI
Real number (ℝ≥0)

Distinct: 41
Distinct (%): 2.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.0815
Minimum: 10
Maximum: 50
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 10
5-th percentile: 11
Q1: 20
median: 30
Q3: 40
95-th percentile: 48
Maximum: 50
Range: 40
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 11.7612083
Coefficient of variation (CV): 0.3909781196
Kurtosis: -1.182620943
Mean: 30.0815
Median Absolute Deviation (MAD): 10
Skewness: -0.0175554741
Sum: 60163
Variance: 138.3260208
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)

Value | Count | Frequency (%)
11 | 62 | 3.1%
38 | 62 | 3.1%
34 | 59 | 2.9%
26 | 59 | 2.9%
41 | 57 | 2.9%
21 | 57 | 2.9%
20 | 54 | 2.7%
40 | 53 | 2.6%
35 | 53 | 2.6%
15 | 52 | 2.6%
Other values (31) | 1432 | 71.6%

Value | Count | Frequency (%)
10 | 48 | 2.4%
11 | 62 | 3.1%
12 | 42 | 2.1%
13 | 38 | 1.9%
14 | 44 | 2.2%
15 | 52 | 2.6%
16 | 47 | 2.4%
17 | 39 | 1.9%
18 | 49 | 2.5%
19 | 45 | 2.2%

Value | Count | Frequency (%)
50 | 50 | 2.5%
49 | 46 | 2.3%
48 | 45 | 2.2%
47 | 49 | 2.5%
46 | 51 | 2.5%
45 | 45 | 2.2%
44 | 41 | 2.1%
43 | 52 | 2.6%
42 | 50 | 2.5%
41 | 57 | 2.9%

Sex
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
0: 1008
1: 992

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1008 | 50.4%
1 | 992 | 49.6%

Pregnancy
Categorical

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): 0.5%
Missing: 1558
Missing (%): 77.9%
Memory size: 15.8 KiB
0.0: 243
1.0: 199

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 1326
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common Values

Value | Count | Frequency (%)
0.0 | 243 | 12.2%
1.0 | 199 | 10.0%
(Missing) | 1558 | 77.9%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0.0 | 243 | 55.0%
1.0 | 199 | 45.0%

Most occurring characters

Value | Count | Frequency (%)
0 | 685 | 51.7%
. | 442 | 33.3%
1 | 199 | 15.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 884 | 66.7%
Other Punctuation | 442 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 685 | 77.5%
1 | 199 | 22.5%

Other Punctuation
Value | Count | Frequency (%)
. | 442 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 1326 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 685 | 51.7%
. | 442 | 33.3%
1 | 199 | 15.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 1326 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 685 | 51.7%
. | 442 | 33.3%
1 | 199 | 15.0%

Smoking
Categorical

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
1: 1019
0: 981

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Most occurring characters

Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
1 | 1019 | 50.9%
0 | 981 | 49.0%

Physical_activity
Real number (ℝ≥0)

Distinct: 1951
Distinct (%): 97.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 25254.4245
Minimum: 628
Maximum: 49980
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 628
5-th percentile: 3141.75
Q1: 13605.75
median: 25353
Q3: 37382.25
95-th percentile: 47170.2
Maximum: 49980
Range: 49352
Interquartile range (IQR): 23776.5

Descriptive statistics

Standard deviation: 14015.43962
Coefficient of variation (CV): 0.5549696697
Kurtosis: -1.161726861
Mean: 25254.4245
Median Absolute Deviation (MAD): 11893.5
Skewness: -0.01055936725
Sum: 50508849
Variance: 196432547.8
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
32903 | 2 | 0.1%
4591 | 2 | 0.1%
29513 | 2 | 0.1%
18629 | 2 | 0.1%
2971 | 2 | 0.1%
40046 | 2 | 0.1%
38769 | 2 | 0.1%
5673 | 2 | 0.1%
22664 | 2 | 0.1%
14479 | 2 | 0.1%
Other values (1941) | 1980 | 99.0%

Value | Count | Frequency (%)
628 | 1 | 0.1%
745 | 1 | 0.1%
768 | 2 | 0.1%
774 | 1 | 0.1%
784 | 1 | 0.1%
791 | 1 | 0.1%
799 | 1 | 0.1%
814 | 1 | 0.1%
829 | 1 | 0.1%
847 | 1 | 0.1%

Value | Count | Frequency (%)
49980 | 1 | 0.1%
49940 | 1 | 0.1%
49926 | 1 | 0.1%
49915 | 1 | 0.1%
49806 | 1 | 0.1%
49783 | 1 | 0.1%
49759 | 1 | 0.1%
49682 | 1 | 0.1%
49671 | 1 | 0.1%
49665 | 1 | 0.1%

salt_content_in_the_diet
Real number (ℝ≥0)

Distinct: 1945
Distinct (%): 97.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 24926.097
Minimum: 22
Maximum: 49976
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 22
5-th percentile: 2462.1
Q1: 13151.75
median: 25046.5
Q3: 36839.75
95-th percentile: 47202.25
Maximum: 49976
Range: 49954
Interquartile range (IQR): 23688

Descriptive statistics

Standard deviation: 14211.69259
Coefficient of variation (CV): 0.5701531446
Kurtosis: -1.154963837
Mean: 24926.097
Median Absolute Deviation (MAD): 11813
Skewness: -0.02179784832
Sum: 49852194
Variance: 201972206.2
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
26353 | 2 | 0.1%
29324 | 2 | 0.1%
7545 | 2 | 0.1%
48995 | 2 | 0.1%
35221 | 2 | 0.1%
28517 | 2 | 0.1%
22942 | 2 | 0.1%
26474 | 2 | 0.1%
31366 | 2 | 0.1%
38265 | 2 | 0.1%
Other values (1935) | 1980 | 99.0%

Value | Count | Frequency (%)
22 | 1 | 0.1%
44 | 1 | 0.1%
58 | 1 | 0.1%
62 | 1 | 0.1%
66 | 1 | 0.1%
105 | 1 | 0.1%
144 | 1 | 0.1%
150 | 1 | 0.1%
154 | 1 | 0.1%
161 | 1 | 0.1%

Value | Count | Frequency (%)
49976 | 1 | 0.1%
49956 | 1 | 0.1%
49846 | 1 | 0.1%
49800 | 1 | 0.1%
49778 | 1 | 0.1%
49710 | 1 | 0.1%
49700 | 1 | 0.1%
49644 | 1 | 0.1%
49642 | 1 | 0.1%
49626 | 1 | 0.1%

alcohol_consumption_per_day
Real number (ℝ≥0)

MISSING

Distinct: 488
Distinct (%): 27.8%
Missing: 242
Missing (%): 12.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 251.0085324
Minimum: 0
Maximum: 499
Zeros: 9
Zeros (%): 0.4%
Negative: 0
Negative (%): 0.0%
Memory size: 15.8 KiB

Quantile statistics

Minimum: 0
5-th percentile: 28.85
Q1: 126.25
median: 250
Q3: 377.75
95-th percentile: 473.15
Maximum: 499
Range: 499
Interquartile range (IQR): 251.5

Descriptive statistics

Standard deviation: 143.6518844
Coefficient of variation (CV): 0.5722988101
Kurtosis: -1.217678643
Mean: 251.0085324
Median Absolute Deviation (MAD): 126
Skewness: -0.008259128943
Sum: 441273
Variance: 20635.8639
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
253 | 11 | 0.5%
302 | 10 | 0.5%
144 | 10 | 0.5%
401 | 10 | 0.5%
347 | 9 | 0.4%
0 | 9 | 0.4%
485 | 9 | 0.4%
446 | 8 | 0.4%
206 | 8 | 0.4%
180 | 8 | 0.4%
Other values (478) | 1666 | 83.3%
(Missing) | 242 | 12.1%

Value | Count | Frequency (%)
0 | 9 | 0.4%
1 | 3 | 0.1%
2 | 3 | 0.1%
3 | 5 | 0.2%
4 | 2 | 0.1%
5 | 4 | 0.2%
6 | 5 | 0.2%
8 | 4 | 0.2%
9 | 3 | 0.1%
11 | 4 | 0.2%

Value | Count | Frequency (%)
499 | 2 | 0.1%
497 | 1 | 0.1%
496 | 3 | 0.1%
495 | 5 | 0.2%
494 | 4 | 0.2%
493 | 1 | 0.1%
492 | 3 | 0.1%
491 | 4 | 0.2%
490 | 3 | 0.1%
488 | 4 | 0.2%

Level_of_Stress
Categorical

Distinct: 3
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
3: 691
1: 666
2: 643

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 1
5th row: 2

Common Values

Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Most occurring characters

Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
3 | 691 | 34.5%
1 | 666 | 33.3%
2 | 643 | 32.1%

Chronic_kidney_disease
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
0: 1287
1: 713

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Most occurring characters

Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1287 | 64.3%
1 | 713 | 35.6%

Adrenal_and_thyroid_disorders
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.8 KiB
0: 1404
1: 596

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 2000 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Most occurring scripts

Value | Count | Frequency (%)
Common | 2000 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2000 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 1404 | 70.2%
1 | 596 | 29.8%

Interactions

[64 interaction plots, rendered with Matplotlib v3.4.1]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
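
The definition above translates directly into a few lines of code. The following is a minimal sketch in plain Python, not the routine the report itself calls; the function name `pearson_r` and the toy data are illustrative.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / (n - 1))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / (n - 1))
    return cov / (std_x * std_y)

x = [1, 2, 3, 4, 5]
y = [2 * v + 3 for v in x]             # exact linear relationship
r_line = pearson_r(x, y)

# Shifting and rescaling x leaves r unchanged (location/scale invariance).
r_rescaled = pearson_r([10 * v - 7 for v in x], y)

# A decreasing linear relationship gives the opposite sign.
r_negative = pearson_r(x, [-v for v in x])

print(r_line, r_rescaled, r_negative)
```

The (n - 1) factors cancel in the ratio, so using population rather than sample estimates would give the same r.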

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
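
As a sketch of that recipe: replace each value by its rank, then apply the Pearson formula to the ranks. Giving tied values the average of their positions is a common convention assumed here; the helper names are illustrative and this is not the report's own code.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]                   # monotonic but not linear
rho_monotonic = spearman_rho(x, y)
r_monotonic = pearson_r(x, y)
print(rho_monotonic, r_monotonic)
```

For this monotonic but nonlinear example, ρ is essentially 1 while r is smaller, which is the difference the paragraph above describes.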

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
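
The pair-counting rule can be sketched as follows. This implements the formula exactly as stated, which is usually called tau-a; statistical libraries often report a tie-adjusted variant (tau-b) instead, and the two agree only when there are no ties. The function name is illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:       # both variables move the same way
            concordant += 1
        elif direction < 0:     # the variables move in opposite ways
            discordant += 1
        # direction == 0 is a tie and counts as neither
    return (concordant - discordant) / len(pairs)

tau_same = kendall_tau_a([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_a([1, 2, 3, 4], [4, 3, 2, 1])
tau_mixed = kendall_tau_a([1, 2, 3], [1, 3, 2])   # 2 concordant, 1 discordant
print(tau_same, tau_reversed, tau_mixed)
```

The double loop is quadratic in the number of observations, which is fine for illustration but slow for large data.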

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. There is extensive documentation available here.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use a bias-corrected measure that has been proposed by Bergsma in 2013 that can be found here.
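
A sketch of a bias-corrected Cramér's V computed from a contingency table is shown below. It follows the commonly cited form of Bergsma's correction; whether the report uses exactly this form is an assumption, the function name is illustrative, and the code assumes every row and column of the table has a non-zero total.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of observed counts."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Subtract the expected value of phi2 under independence, floor at zero.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))

v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # perfect association
v_independent = cramers_v_corrected([[25, 25], [25, 25]])  # independence
print(v_perfect, v_independent)
```

The two toy tables illustrate the endpoints of the 0 to 1 range described above.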

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.

Sample

First rows

Patient_Number | Blood_Pressure_Abnormality | Level_of_Hemoglobin | Genetic_Pedigree_Coefficient | Age | BMI | Sex | Pregnancy | Smoking | Physical_activity | salt_content_in_the_diet | alcohol_consumption_per_day | Level_of_Stress | Chronic_kidney_disease | Adrenal_and_thyroid_disorders
01111.280.90342311.004596148071NaN211
1209.750.2354331NaN02610625333205.0300
23110.790.9170490NaN099952946567.0210
34011.000.4371500NaN0106357439242.0100
45114.170.8352190NaN01561949644397.0200
56011.640.5423480NaN1270427513NaN300
67111.690.75434111.003836932967206.0311
78012.700.4148200NaN02978126749134.0200
89010.880.6872440NaN0814960799.0300
910114.560.6140440NaN012781271595.0200

Last rows

Patient_Number | Blood_Pressure_Abnormality | Level_of_Hemoglobin | Genetic_Pedigree_Coefficient | Age | BMI | Sex | Pregnancy | Smoking | Physical_activity | salt_content_in_the_diet | alcohol_consumption_per_day | Level_of_Stress | Chronic_kidney_disease | Adrenal_and_thyroid_disorders
19901991111.210.0163250NaN132903454050.0300
19911992115.530.1222240NaN04832516514NaN211
1992199319.380.4960391NaN14659129557125.0111
1993199409.691.0073421NaN1433443623048.0300
19941995011.070.6658311NaN03860322836379.0200
19951996110.140.0269261NaN12611847568144.0310
19961997111.771.00244511.0125728063NaN311
19971998116.910.2218420NaN01493324753NaN211
19981999011.150.7246451NaN11815715275253.0300
19992000111.360.0941450NaN02072930463230.0110